Poisson Distribution
Poisson Distribution: Definition and Characteristics
Definition
The **Poisson distribution**, named after the French mathematician Siméon Denis Poisson, is a discrete probability distribution. It is used to model the probability of a specific number of events occurring in a **fixed interval of time, space, or other dimension**, given that these events happen with a known constant average rate and independently of the time or location of the previous event.
It is commonly used to describe the number of occurrences of a relatively rare event in a large number of trials or over a continuous observation period, provided the conditions for its application are met.
Characteristics and Assumptions
A random variable $X$ is said to follow a Poisson distribution if it represents the number of occurrences of an event over a fixed interval, and the process generating these events satisfies the following conditions (often referred to as the assumptions of a Poisson process):
- **Independence of Occurrences:** The occurrence of an event in one part of the interval (or in one subinterval) does not affect the probability of the event occurring in any other disjoint part of the interval. Events happen independently.
- **Constant Average Rate ($\lambda$):** The events occur at a constant average rate. This average number of occurrences in the specified interval is denoted by $\lambda$ (lambda). $\lambda$ is the key parameter of the Poisson distribution. This rate must be consistent throughout the interval.
- **Rareness (or Non-simultaneity):** The probability of more than one event occurring in a very small subinterval is negligible (close to zero). Events occur one at a time; they do not occur simultaneously at the exact same point in time or space.
- **Proportionality:** The probability of an event occurring in a very small subinterval is proportional to the length or size of the subinterval.
The random variable $X$ representing the number of occurrences in the fixed interval can take on any non-negative integer value: $0, 1, 2, 3, \dots$. The sample space for $X$ is $\{0, 1, 2, \dots\}$, which is countably infinite.
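The assumptions above can be illustrated with a small simulation. The sketch below is only a discretized approximation of a Poisson process: it splits the interval into `n_slots` tiny subintervals (a hypothetical parameter, not part of the theory), lets each one contain at most one event, and gives every slot the same small probability $\lambda / n$ independently of the others. The function name `simulate_count` and the seed are illustrative choices.

```python
import random


def simulate_count(lam: float, n_slots: int, rng: random.Random) -> int:
    """Count events in one interval split into n_slots tiny subintervals.

    Each slot holds at most one event (rareness), has probability
    lam / n_slots proportional to its length (proportionality), and is
    drawn independently of the other slots (independence).
    """
    p = lam / n_slots
    return sum(1 for _ in range(n_slots) if rng.random() < p)


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [simulate_count(3.0, 500, rng) for _ in range(2000)]
sample_mean = sum(counts) / len(counts)
print("sample mean of simulated counts:", sample_mean)
```

With enough repetitions the sample mean of the simulated counts should settle near $\lambda$, and every simulated count is a non-negative integer, as described above.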
Examples of Phenomena Modeled by Poisson Distribution
The Poisson distribution is useful for modeling count data over a specified interval when the events are relatively rare and occur randomly with a constant average rate.
- The number of emergency calls received by a fire station in a specific hour.
- The number of defects in a roll of fabric of a standard length.
- The number of customers arriving at a shop during a 15-minute interval.
- The number of traffic accidents at a particular intersection during a month.
- The number of mutations in a given segment of DNA after exposure to radiation.
- The number of particles emitted by a radioactive source in a specified time period.
- The number of typos per page in a book.
In all these examples, we are counting discrete events occurring randomly over a defined continuum (time, length, area, etc.).
Probability Mass Function of Poisson Distribution
Definition
For a discrete random variable $X$ that follows a Poisson distribution with an average rate (or mean) of $\lambda$ events per interval, denoted as $X \sim \text{Poisson}(\lambda)$, its probability distribution is defined by its Probability Mass Function (PMF). The PMF gives the probability of observing exactly $k$ events in the fixed interval.
Formula for the Probability Mass Function (PMF)
The probability of observing exactly $k$ occurrences of the event in the fixed interval is given by the Poisson probability formula:
$$P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$$
... (1)
This formula is valid for $k = 0, 1, 2, 3, \dots$ (all non-negative integers).
Where:
- $k$: The specific number of occurrences (events) whose probability we want to find. $k$ must be a non-negative integer.
- $\lambda$ (lambda): The average number of occurrences (the mean) in the specified fixed interval. $\lambda$ is the single parameter of the Poisson distribution and must be a positive value ($\lambda > 0$).
- $e$: The base of the natural logarithm, a mathematical constant approximately equal to 2.71828. (Many calculators have an $e^x$ function).
- $k!$: The factorial of $k$, which is the product of all positive integers from 1 to $k$ ($k! = k \times (k-1) \times \dots \times 2 \times 1$). Note that $0!$ is defined as 1.
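As a minimal sketch, formula (1) can be translated directly into Python using only the standard library. The function name `poisson_pmf` is an illustrative choice, and the direct form below is meant for small $k$; for very large $k$ the term `lam**k` can overflow a float.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Poisson(lam), following formula (1)."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k < 0:
        return 0.0  # counts below zero have probability 0
    return math.exp(-lam) * lam**k / math.factorial(k)


# Probability of exactly 2 events when the average is 3 per interval
print(poisson_pmf(2, 3.0))
```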
Properties Related to PMF
The Poisson PMF satisfies the necessary properties for a discrete probability distribution:
- Non-negativity: For any $k \ge 0$ and $\lambda > 0$, the values $e^{-\lambda}$, $\lambda^k$, and $k!$ are all positive. Therefore, the probability $P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$ is always non-negative:
$$P(X=k) \ge 0 \quad \text{for all } k \ge 0$$
... (2)
- Summation to One: The sum of the probabilities of all possible values of $k$ must equal 1. This confirms that the distribution covers all possible numbers of occurrences.
$$\sum_{k=0}^{\infty} P(X=k) = \sum_{k=0}^{\infty} \frac{e^{-\lambda} \lambda^k}{k!}$$
... (iii)
We can factor out $e^{-\lambda}$ from the sum:
$$ = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}$$
... (iv)
The infinite series $\sum_{k=0}^{\infty} \frac{\lambda^k}{k!}$ is the well-known Taylor (Maclaurin) series expansion for $e^\lambda$.
$$e^\lambda = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}$$
... (v)
Substitute this into (iv):
$$ = e^{-\lambda} (e^\lambda) = e^{(-\lambda + \lambda)} = e^0 = 1$$
... (vi)
$$\sum_{k=0}^{\infty} P(X=k) = 1$$
... (3)
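The summation property can also be checked numerically. The sketch below adds up the PMF for $k = 0, 1, \dots, k_{\max}$ and relies on the ratio $P(X=k+1) / P(X=k) = \lambda / (k+1)$, which follows from formula (1). The function name `poisson_partial_sum` and the cut-off values are illustrative.

```python
import math


def poisson_partial_sum(lam: float, k_max: int) -> float:
    """Sum of P(X = k) for k = 0..k_max, with X ~ Poisson(lam)."""
    total = 0.0
    term = math.exp(-lam)  # P(X = 0)
    for k in range(k_max + 1):
        total += term
        term *= lam / (k + 1)  # step from P(X = k) to P(X = k + 1)
    return total


for k_max in (2, 5, 10, 50):
    print(k_max, poisson_partial_sum(3.0, k_max))
```

The partial sums increase toward 1 as more terms are included, in line with result (3).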
Example
Example 1. On average, a call center receives 3 calls per hour. Assuming the number of calls per hour follows a Poisson distribution, what is the probability that the call center receives exactly 2 calls in a randomly selected hour?
Answer:
Given: The number of calls per hour follows a Poisson distribution. Average rate $\lambda = 3$ calls per hour. Desired number of calls $k=2$.
To Find: The probability of receiving exactly 2 calls in an hour, $P(X=2)$.
Solution:
We are given $\lambda = 3$ and $k = 2$. We use the Poisson PMF formula $P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$ (Formula 1).
$$P(X=2) = \frac{e^{-3} 3^2}{2!}$$
... (vii)
Calculate $3^2 = 9$ and $2! = 2 \times 1 = 2$.
$$P(X=2) = \frac{e^{-3} \times 9}{2}$$
$$P(X=2) = \frac{9 e^{-3}}{2}$$
... (viii)
To get a numerical value, we use the approximation $e^{-3} \approx 0.049787$.
$$P(X=2) \approx \frac{9 \times 0.049787}{2}$$
$$P(X=2) \approx \frac{0.448083}{2}$$
$$P(X=2) \approx 0.2240415$$
$$P(X=2) \approx 0.2240$$
(rounded to four decimal places) ... (ix)
The probability of receiving exactly 2 calls in a randomly selected hour is $\frac{9e^{-3}}{2}$ or approximately 0.2240.
Example 2. Using the information from Example 1 ($\lambda=3$), what is the probability of receiving at most 1 call in an hour?
Answer:
Given: Poisson distribution with $\lambda=3$.
To Find: The probability of receiving at most 1 call, $P(X \le 1)$.
Solution:
Receiving "at most 1 call" means the number of calls is either 0 or 1. So, we need to find the sum of the probabilities $P(X=0)$ and $P(X=1)$. Since these are mutually exclusive events, $P(X \le 1) = P(X=0) + P(X=1)$.
Using the Poisson PMF $P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$ with $\lambda=3$:
Calculate $P(X=0)$: (Here $k=0$)
$$P(X=0) = \frac{e^{-3} 3^0}{0!} = \frac{e^{-3} \times 1}{1}$$
($3^0=1, 0!=1$)
$$P(X=0) = e^{-3}$$
... (x)
Using $e^{-3} \approx 0.049787$, $P(X=0) \approx 0.0498$.
... (xi)
Calculate $P(X=1)$: (Here $k=1$)
$$P(X=1) = \frac{e^{-3} 3^1}{1!} = \frac{e^{-3} \times 3}{1}$$
($3^1=3, 1!=1$)
$$P(X=1) = 3e^{-3}$$
... (xii)
Using $e^{-3} \approx 0.049787$, $P(X=1) \approx 3 \times 0.049787 \approx 0.149361$.
... (xiii)
$$P(X=1) \approx 0.1494$$
(rounded to four decimal places) ... (xiv)
Add the probabilities:
$$P(X \le 1) = P(X=0) + P(X=1)$$
... (xv)
$$P(X \le 1) = e^{-3} + 3e^{-3} = 4e^{-3}$$
... (xvi)
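Using $e^{-3} \approx 0.049787$:
$$P(X \le 1) \approx 4 \times 0.049787$$
... (xvii)
$$P(X \le 1) \approx 0.199148 \approx 0.1991$$
... (xviii)
(Adding the already-rounded values $0.0498 + 0.1494$ would give 0.1992; the last digit differs only because of accumulated rounding, so it is safer to round once at the end.)
The probability of receiving at most 1 call in an hour is $4e^{-3}$ or approximately 0.1991.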
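Cumulative probabilities such as $P(X \le 1)$ are sums of PMF values, which is easy to sketch in code. The helper names `poisson_pmf` and `poisson_cdf` below are illustrative, not taken from any particular library.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k): add the PMF over 0, 1, ..., k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


at_most_one = poisson_cdf(1, 3.0)   # P(X <= 1), as in Example 2
at_least_two = 1.0 - at_most_one    # complement: P(X >= 2)
print(at_most_one, at_least_two)
```

The value of `at_most_one` agrees with the closed form $4e^{-3}$, and the complement rule gives the probability of two or more calls without summing an infinite series.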
Mean and Variance of Poisson Distribution
A notable property of the Poisson distribution $X \sim \text{Poisson}(\lambda)$ is the simple relationship between its mean and its variance: both are equal to the single parameter $\lambda$.
Mean (Expected Value) of a Poisson Distribution
The mean or expected value of a Poisson random variable $X$, denoted by $E(X)$ or $\mu$, is equal to the distribution's parameter $\lambda$. This aligns with the interpretation of $\lambda$ as the average rate or average number of occurrences in the specified interval.
Formula for the Mean of a Poisson Distribution:
$$E(X) = \mu = \lambda$$
... (1)
Derivation Outline: The derivation uses the definition of expected value $E(X) = \sum_{k=0}^{\infty} k \cdot P(X=k)$ and the Poisson PMF $P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$.
$$E(X) = \sum_{k=0}^{\infty} k \frac{e^{-\lambda} \lambda^k}{k!}$$
The term for $k=0$ is $0 \cdot P(X=0) = 0$, so we can start the sum from $k=1$. Also, $k! = k \cdot (k-1)!$ for $k \ge 1$.
$$E(X) = \sum_{k=1}^{\infty} k \frac{e^{-\lambda} \lambda^k}{k(k-1)!} = \sum_{k=1}^{\infty} \frac{e^{-\lambda} \lambda^k}{(k-1)!}$$
Factor out $e^{-\lambda}$ and $\lambda$ (from $\lambda^k = \lambda \cdot \lambda^{k-1}$):
$$E(X) = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!}$$
Let $j = k-1$. As $k$ goes from 1 to $\infty$, $j$ goes from 0 to $\infty$.
$$E(X) = \lambda e^{-\lambda} \sum_{j=0}^{\infty} \frac{\lambda^{j}}{j!}$$
The sum $\sum_{j=0}^{\infty} \frac{\lambda^j}{j!}$ is the Taylor series expansion for $e^\lambda$.
$$E(X) = \lambda e^{-\lambda} (e^\lambda) = \lambda e^{(-\lambda + \lambda)} = \lambda e^0 = \lambda \cdot 1 = \lambda$$
(Mean Derived)
Variance of a Poisson Distribution
The variance of a Poisson random variable $X$ with parameter $\lambda$ is equal to its mean, which is $\lambda$.
Formula for the Variance of a Poisson Distribution:
$$Var(X) = \sigma^2 = \lambda$$
... (2)
Derivation Outline: The derivation uses the formula $Var(X) = E(X^2) - [E(X)]^2$, where $E(X) = \lambda$ is already known. We need $E(X^2) = \sum_{k=0}^{\infty} k^2 P(X=k) = \sum_{k=0}^{\infty} k^2 \frac{e^{-\lambda} \lambda^k}{k!}$. Rewriting $k^2$ as $k(k-1) + k$ gives $E(X^2) = E[X(X-1)] + E(X)$. In $E[X(X-1)]$ the terms for $k=0$ and $k=1$ vanish, and cancelling $k(k-1)$ against $k! = k(k-1)(k-2)!$ gives $E[X(X-1)] = \sum_{k=2}^{\infty} \frac{e^{-\lambda} \lambda^k}{(k-2)!} = \lambda^2 e^{-\lambda} \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} = \lambda^2$. Hence $E(X^2) = \lambda^2 + \lambda$. Substituting this into the variance formula gives $Var(X) = (\lambda^2 + \lambda) - (\lambda)^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$.
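The mean and variance results can be checked numerically by truncating the defining sums at a large cut-off. This is a sketch under the assumption that the tail beyond `k_max` is negligible, which holds for moderate $\lambda$; the function name `poisson_moments` and the default cut-off are illustrative.

```python
import math


def poisson_moments(lam: float, k_max: int = 200) -> tuple[float, float]:
    """Approximate (mean, variance) of Poisson(lam) from truncated sums."""
    mean = 0.0
    second_moment = 0.0
    p = math.exp(-lam)  # P(X = 0)
    for k in range(k_max + 1):
        mean += k * p               # accumulates E(X)
        second_moment += k * k * p  # accumulates E(X^2)
        p *= lam / (k + 1)          # P(X = k + 1) from P(X = k)
    return mean, second_moment - mean**2


mean, variance = poisson_moments(1.5)
print(mean, variance, math.sqrt(variance))
```

Both returned values should be very close to $\lambda$, matching $E(X) = Var(X) = \lambda$.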
Standard Deviation
The standard deviation ($\sigma$) of a Poisson random variable is the positive square root of its variance.
Formula for the Standard Deviation of a Poisson Distribution:
$$SD(X) = \sigma = \sqrt{\lambda}$$
... (3)
Key Property: Mean = Variance
A notable and sometimes surprising property of the Poisson distribution is that its **mean and variance are equal** ($E(X) = Var(X) = \lambda$). This property, sometimes called equidispersion, gives a quick diagnostic for observed count data: if the sample mean and sample variance are close, a Poisson model may be reasonable, whereas a sample variance much larger than the sample mean (overdispersion) suggests that the Poisson assumptions do not hold. Equality of mean and variance is necessary for a Poisson model but does not by itself establish it.
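One simple way to compare the sample mean and sample variance is their ratio, often called the index of dispersion. The sketch below uses Python's `statistics` module; the data in `weekly_counts` are made up purely for illustration, and the function name `dispersion_index` is an illustrative choice.

```python
import statistics


def dispersion_index(counts: list[int]) -> float:
    """Sample variance divided by sample mean.

    A value near 1 is consistent with (but does not prove) a Poisson model.
    """
    m = statistics.mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return statistics.variance(counts) / m


# Hypothetical weekly event counts, invented for this example
weekly_counts = [2, 0, 1, 3, 1, 2, 0, 1, 4, 1]
print(dispersion_index(weekly_counts))
```

A ratio well above 1 points to overdispersion, and a ratio well below 1 to underdispersion; with small samples the ratio is noisy, so it should be read as a rough indicator only.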
Example
Example 1. The number of accidents per week on a certain highway follows a Poisson distribution with a mean of 1.5 accidents per week. Find the variance and standard deviation of the number of accidents per week.
Answer:
Given: The number of accidents per week follows a Poisson distribution with a mean of 1.5.
To Find: The variance and standard deviation.
Solution:
For a Poisson distribution, the parameter $\lambda$ is equal to the mean. We are given the mean $\mu = E(X) = 1.5$ accidents per week.
So, the Poisson parameter is $\lambda = 1.5$.
Calculate Variance:
For a Poisson distribution, the variance is equal to $\lambda$ (Formula 2).
$$Var(X) = \sigma^2 = \lambda$$
... (iv)
$$Var(X) = 1.5$$
... (v)
The variance is 1.5 (in units of accidents squared, since the variance of a count carries the square of the count's unit).
Calculate Standard Deviation:
The standard deviation is the positive square root of the variance (Formula 3).
$$SD(X) = \sigma = \sqrt{Var(X)} = \sqrt{\lambda}$$
... (vi)
$$\sigma = \sqrt{1.5}$$
... (vii)
Calculating the numerical value:
$$\sqrt{1.5} \approx 1.22474$$
$$\sigma \approx 1.225$$
(rounded to three decimal places) ... (viii)
The standard deviation is $\sqrt{1.5}$ or approximately 1.225 accidents per week.
Poisson Distribution as a Limiting Case of Binomial Distribution
Relationship between Binomial and Poisson Distributions
The Poisson distribution is closely related to the Binomial distribution. In fact, the Poisson distribution can be formally derived as the **limiting case** of the Binomial distribution under specific conditions. This means that under certain circumstances, the probabilities calculated using the Binomial formula will become very close to the probabilities calculated using the Poisson formula.
Conditions for the Limiting Case (Poisson Approximation to Binomial)
The Binomial distribution $B(n, p)$, which describes the number of successes in $n$ trials with success probability $p$, can be approximated by the Poisson distribution $\text{Poisson}(\lambda)$ when the following three conditions are met simultaneously:
- **The number of trials $n$ is very large.** Mathematically, this is expressed as $n \to \infty$.
- **The probability of success $p$ on each trial is very small.** Mathematically, this is expressed as $p \to 0$.
- **The product $np$ remains constant and finite.** The mean (expected number of successes) of the binomial distribution, $E(X) = np$, is held at a fixed finite positive value as $n$ grows and $p$ shrinks. This value becomes the parameter of the Poisson distribution: $\lambda = np$.
When these conditions are met, the probability of getting exactly $k$ successes in a binomial experiment, $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$, can be shown mathematically to approach the Poisson probability $\frac{e^{-\lambda} \lambda^k}{k!}$ as $n \to \infty$ and $p \to 0$ while $np=\lambda$.
$$\lim_{\substack{n \to \infty \\ p \to 0 \\ np = \lambda}} \binom{n}{k} p^k (1-p)^{n-k} = \frac{e^{-\lambda} \lambda^k}{k!}$$
... (1)
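The limit in (1) can be observed numerically by holding $\lambda = np$ fixed and letting $n$ grow. The sketch below uses `math.comb` from the standard library (Python 3.8 or later); the values of $\lambda$, $k$, and $n$ are arbitrary illustrative choices.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam, k = 2.0, 3
for n in (10, 100, 1000, 10000):
    p = lam / n  # keep np = lam fixed while n grows and p shrinks
    gap = abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
    print(n, binomial_pmf(k, n, p), poisson_pmf(k, lam), gap)
```

The gap between the two probabilities shrinks as $n$ increases, which is what the limiting statement predicts.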
Practical Application (Poisson Approximation to Binomial)
This limiting relationship is not just theoretical; it has practical implications. In real-world scenarios where the conditions ($n$ large, $p$ small) are approximately met, we can use the Poisson distribution as an approximation to the Binomial distribution. This approximation is useful because calculating binomial probabilities with very large $n$ can be computationally complex (involving large factorials and powers), whereas the Poisson formula is often simpler.
As practical guidelines for when the Poisson approximation to the binomial is generally considered reasonable:
- $n$ is large (e.g., $n \ge 20$, often better if $n \ge 50$ or $n \ge 100$).
- $p$ is small (e.g., $p \le 0.1$, often better if $p \le 0.05$).
- The mean $\lambda = np$ is moderate (e.g., $np < 5$, sometimes up to $np < 10$). The approximation accuracy decreases as $np$ increases for a fixed $p$. However, if $p$ is extremely small, the approximation can be good even for larger $\lambda$.
When these conditions hold, the number of successes in the binomial experiment can be approximated by a Poisson random variable with parameter $\lambda = np$. The probability of getting exactly $k$ successes is approximately $P(X=k) \approx \frac{e^{-\lambda} \lambda^k}{k!}$ with $\lambda=np$.
Example
Example 1. Suppose that 0.5% of the fuses produced by a large factory are defective. If a random sample of 400 fuses is taken, find the approximate probability that exactly 3 of them are defective.
Answer:
Given: 0.5% of the fuses are defective ($p = 0.005$); the sample size is $n = 400$; the desired number of defective fuses is $k = 3$.
To Find: Approximate probability of exactly 3 defective fuses.
Solution:
This problem fits the structure of a binomial experiment:
- Number of trials ($n$): The sample size is 400 fuses, so $n=400$.
- Two outcomes per trial: A fuse is either defective (Success) or not defective (Failure).
- Constant probability of success: The probability of a fuse being defective is 0.5%, so $p = 0.5\% = \frac{0.5}{100} = 0.005$, assumed to be the same for every fuse.
- Independent trials: We assume the defect status of one fuse does not affect that of any other.
The random variable $X$, the number of defective fuses in the sample, follows a binomial distribution $X \sim B(400, 0.005)$. We want to find $P(X=3)$. The exact binomial calculation would be $P(X=3) = \binom{400}{3} (0.005)^3 (0.995)^{397}$, which is tedious to evaluate by hand.
Let's check if the conditions for the Poisson approximation to the binomial distribution are met:
- $n=400$, which is a large number ($n \ge 20$ is a common guideline).
- $p=0.005$, which is a small probability ($p \le 0.1$ is a common guideline).
Since $n$ is large and $p$ is small, the Poisson distribution can be used to approximate the binomial distribution. The parameter $\lambda$ for the approximating Poisson distribution is the mean of the binomial distribution, $\lambda = np$.
$$\lambda = np = 400 \times 0.005$$
... (ii)
$$\lambda = 2$$
... (iii)
The mean number of defective fuses in a sample of 400 is 2. This is a moderate value for $\lambda$.
We now approximate $P(X=3)$ using the Poisson probability formula with $\lambda=2$ and $k=3$.
$$P(X=k) \approx \frac{e^{-\lambda} \lambda^k}{k!}$$
... (iv)
Substitute $\lambda=2$ and $k=3$:
$$P(X=3) \approx \frac{e^{-2} 2^3}{3!}$$
... (v)
Calculate $2^3 = 8$ and $3! = 3 \times 2 \times 1 = 6$.
$$P(X=3) \approx \frac{e^{-2} \times 8}{6}$$
$$P(X=3) \approx \frac{8 e^{-2}}{6} = \frac{4 e^{-2}}{3}$$
... (vi)
To get a numerical value, we use the approximation $e^{-2} \approx 0.135335$.
$$P(X=3) \approx \frac{4 \times 0.135335}{3}$$
$$P(X=3) \approx \frac{0.54134}{3}$$
$$P(X=3) \approx 0.180446...$$
$$P(X=3) \approx 0.1804$$
(rounded to four decimal places) ... (vii)
The approximate probability that exactly 3 fuses are defective in a sample of 400 is $\frac{4e^{-2}}{3}$ or about 0.1804.
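To get a feel for how good the approximation is here, the exact binomial probability can be computed alongside the Poisson value. This is a sketch using only the standard library; the variable names are illustrative.

```python
import math

n, p, k = 400, 0.005, 3
lam = n * p  # Poisson parameter, lambda = np

# Exact binomial probability: C(n, k) * p^k * (1 - p)^(n - k)
exact = math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Poisson approximation: e^(-lambda) * lambda^k / k!
approx = math.exp(-lam) * lam**k / math.factorial(k)

print("exact binomial:", exact)
print("Poisson approx:", approx)
print("absolute difference:", abs(exact - approx))
```

For these values the two probabilities differ by less than 0.001, which supports using the simpler Poisson formula in this example.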